Estimating a sparse reduction for general regression in high dimensions

Authors

  • Tao Wang
  • Mengjie Chen
  • Hongyu Zhao
  • Lixing Zhu
Abstract

Although the concept of sufficient dimension reduction was proposed long ago, studies in the literature have largely focused on properties of estimators of dimension-reduction subspaces in the classical “small p, large n” setting. Rather than the subspace, this paper directly considers the set of reduced predictors, which we believe is more relevant for subsequent analyses, and proposes a principled method for estimating a sparse reduction, based on a new representation of a well-known method called sliced inverse regression. A fast and efficient algorithm is developed for computing the estimator. The asymptotic behavior of the new method is studied when the number of predictors, p, exceeds the sample size, n, providing a guide for choosing the number of sufficient dimension-reduction predictors. Numerical results, including a simulation study and an analysis of cancer-drug sensitivity data, are presented to examine the performance.
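For orientation, the following is a minimal sketch of classical sliced inverse regression, the well-known method the abstract builds on. It is not the sparse estimator proposed in the paper, and it assumes the classical setting with n larger than p so that the sample covariance can be inverted. The function name `sliced_inverse_regression`, its parameters, and the simulated data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def sliced_inverse_regression(X, y, n_slices=10, n_directions=1):
    """Classical SIR sketch (assumes n > p); returns a (p, n_directions) matrix."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)                      # center the predictors
    cov = Xc.T @ Xc / n                          # sample covariance
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = Xc @ inv_sqrt                            # standardized predictors

    # Slice the observations by the sorted response.
    slices = np.array_split(np.argsort(y), n_slices)

    # Weighted covariance of the within-slice means of Z.
    M = np.zeros((p, p))
    for idx in slices:
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)

    # Leading eigenvectors of M, mapped back to the original predictor scale.
    _, vecs = np.linalg.eigh(M)                  # eigenvalues in ascending order
    top = vecs[:, ::-1][:, :n_directions]
    B = inv_sqrt @ top
    return B / np.linalg.norm(B, axis=0)         # unit-length columns

# Hypothetical single-index example: y depends on X only through X @ beta.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 5))
beta = np.array([1.0, 1.0, 0.0, 0.0, 0.0]) / np.sqrt(2.0)
y = (X @ beta) ** 3 + 0.1 * rng.standard_normal(1000)
B_hat = sliced_inverse_regression(X, y)
reduced_predictor = X @ B_hat                    # the estimated reduced predictor
```

The last line reflects the abstract's emphasis on the reduced predictors themselves rather than the subspace. When p exceeds n, the covariance inversion in this sketch breaks down, which is the regime the paper's sparse method is designed for.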

Similar resources

Robust Estimation in Linear Regression with Multicollinearity and Sparse Models

One of the factors affecting the statistical analysis of the data is the presence of outliers. The methods which are not affected by the outliers are called robust methods. Robust regression methods are robust estimation methods of regression model parameters in the presence of outliers. Besides outliers, the linear dependency of regressor variables, which is called multicollinearity...

A non-parametric conditional factor regression model for high-dimensional input and response

In this paper, we propose a non-parametric conditional factor regression (NCFR) model for domains with high-dimensional input and response. NCFR enhances linear regression in two ways: a) introducing low-dimensional latent factors leading to dimensionality reduction and b) integrating an Indian Buffet Process as a prior for the latent factors to derive unlimited sparse dimensions. Experimental ...

Mammalian Eye Gene Expression Using Support Vector Regression to Evaluate a Strategy for Detecting Human Eye Disease

Background and purpose: Machine learning is a class of modern, powerful tools that can solve many important problems people face today. Support vector regression (SVR), a member of the machine learning family, is a way to build a regression model. SVR has been proven to be an effective tool in real-value function estimation. As a supervised-learning appr...

A Total Ratio of Vegetation Index (TRVI) for Shrubs Sparse Cover Delineating in Open Woodland

Persian juniper and pistachio grow at low density in the rangelands of north-east Iran. These rangelands are populated by widespread, low-density evergreen conifers and sparse pistachio shrubs, which are not only environmentally but also genetically essential as seed sources for pistachio improvement in orchards. Rangelands offer excellent opportunities...

Pedotransfer functions for estimating soil moisture content using fractal parameters in Ardabil province

The soil moisture curve is an important characteristic of soil, and its measurement is necessary for determining plant-available soil water content, evapotranspiration and irrigation planning. Direct measurements of soil moisture coefficients are time-consuming and costly, but it is possible to estimate these characteristics from readily available soil propert...

Journal:
  • Statistics and Computing

Volume 28  Issue 

Pages  -

Publication date 2018